The Problem of Induction and Machine Learning

Author

  • Francesco Bergadano
Abstract

Are we justified in inferring a general rule from observations that frequently confirm it? This is the usual statement of the problem of induction. The present paper argues that this question is relevant for the understanding of Machine Learning, but insufficient. Research in Machine Learning has prompted another, more fundamental question: the number of possible rules grows exponentially with the size of the examples, and many of them are somehow confirmed by the data; how are we to choose effectively some rules that have good chances of being predictive? We analyze if and how this problem is approached in standard accounts of induction and show the difficulties that are present. Finally, we suggest that the Explanation-based Learning approach and related methods of knowledge-intensive induction could be a partial solution to some of these problems, and help in understanding the question of valid induction from a new perspective.

1 The traditional problem of induction

Induction seems to escape all deductive explanations, because its conclusions cannot be proved to be correct. Worse than this, it is not even possible to prove that they are correct most of the time, unless we are ready to accept very elaborate and questionable premises. Many conclusions obtained by an inductive process are totally wrong, although infinitely many examples confirm them. Some actually get worse as more confirming evidence is found. The philosophical literature is full of such examples; for instance, let me paraphrase a bit the well-known argument of Goodman [Goodman, 1954]. Suppose we define a learning system of "unexpected value" as a system that performs quite badly until August 30, 1991, and then starts to produce incredibly good results.
If one were to believe blindly in the power of induction, then an IJCAI-91 paper describing all kinds of very poor results and emphasizing how badly the system works would thus confirm in many ways that the system is of "unexpected value". The more numerous and the more varied the confirming examples that are possible before the IJCAI conference, the worse the conclusion seems to follow.

* I am grateful to Paola Dessi', Stuart Russell and Lorenza Saitta for helpful comments on a draft version.

In this paper, we analyze the problem of induction in a computational framework, where it is possible to make clear the assumptions that we could rely upon when we (or computers) infer general rules that are justified only by a finite number of confirming examples. When the scope of the enquiry is so restricted, one of the most authoritative approaches to the problem is statistical estimation, as developed, for example, by Neyman. This theory is very well known, but the following will make our later discussion clearer. Suppose, for example, that we are to estimate the mean of a given population, which we know to be normal and with a known standard deviation. Let us observe a sample and its mean. Then, because of the properties of the normal distribution, we may say that, with probability 0.95
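The estimation setting described above (a normal population, a known standard deviation, a 0.95 probability level) corresponds to the textbook two-sided confidence interval for a mean. The sketch below is only an illustration of that standard construction, not code from the paper: the function name, the parameter names `sample`, `sigma` and `confidence`, and the example data are all assumptions introduced here.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample, sigma, confidence=0.95):
    """Two-sided interval for the mean of a normal population.

    Assumes the population standard deviation `sigma` is known,
    as in the setting described in the text. All names here are
    illustrative and do not come from the paper.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    sample_mean = sum(sample) / n
    # Standard normal quantile; about 1.96 when confidence is 0.95.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical sample of four observations with known sigma = 1.0.
low, high = mean_confidence_interval([9.0, 10.0, 11.0, 10.0], sigma=1.0)
print(f"95% interval for the mean: [{low:.2f}, {high:.2f}]")
```

With these made-up numbers the sample mean is 10 and the half-width is about 1.96 × 1.0 / √4 = 0.98, so the interval is roughly [9.02, 10.98]. Note that the guarantee attaches to the procedure (it covers the true mean in 95% of repeated samples) and depends entirely on the normality and known-deviation assumptions.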




Publication date: 1991